Parallel Computation of RBF Kernels for Support Vector Classifiers
Authors
Abstract
While kernel support vector machines are powerful classification algorithms, their computational overhead can be significant, especially for large and high-dimensional data sets. A recent biomedical dataset, for instance, could take as long as 3 weeks to compute its RBF kernel matrix on a modern, single-processor workstation. In this paper, we develop methods for high-performance parallel computation of kernel matrices. There are two key components to a parallel implementation: distribution of the computation across nodes and communication to combine the results. To address the first, we employ a dimension-wise data partition that yields efficient computation and low communication overhead during the initial phase. This partition provides dramatic speedups on large and high-dimensional data, applies to a wide variety of kernel functions, and is an exact computation, producing the same kernel matrix as its sequential implementation. To address communication needs during the second phase, we introduce an approximation specific to the Gaussian RBF kernel that yields sparse partial kernel matrices and, thus, efficient communication. We analyze the approximation error of this method, demonstrating that it falls off exponentially with N, the parameter of the approximation. We also examine the positive definiteness of the approximation with respect to Mercer's condition and show that (a) in the limit of large N our approximation becomes positive definite for any data set and (b) for a fixed data set, there exists a finite N yielding a positive definite kernel matrix. We also give a simple iterative method for selecting N to yield a positive definite kernel matrix on any fixed data set. In practice, we find that positive definiteness is achieved on all of the data sets we examine with very small N (2–5).
Finally, we test the empirical performance of our two methods on a variety of large, real-world data sets, demonstrating large computational speedups with little or no impact on accuracy.
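The exactness of the dimension-wise partition follows from the fact that a squared Euclidean distance decomposes as a sum over feature dimensions, so each node can compute partial squared distances over its slice of dimensions and the results can be summed before a single exponentiation. The following is a minimal single-process sketch of that idea (function names, the NumPy implementation, and the simulated "workers" are illustrative, not the paper's code):

```python
import numpy as np

def partial_sq_dists(X_slice):
    """Pairwise squared distances restricted to one slice of dimensions."""
    sq = np.sum(X_slice ** 2, axis=1)
    return sq[:, None] + sq[None, :] - 2.0 * X_slice @ X_slice.T

def rbf_kernel_partitioned(X, gamma, n_workers):
    """RBF kernel via a dimension-wise partition; exact up to float error."""
    slices = np.array_split(X, n_workers, axis=1)      # distribute dimensions
    total = sum(partial_sq_dists(s) for s in slices)   # combine partial results
    return np.exp(-gamma * total)                      # single final pass

X = np.random.default_rng(0).normal(size=(6, 10))
K_par = rbf_kernel_partitioned(X, gamma=0.5, n_workers=3)
K_seq = np.exp(-0.5 * partial_sq_dists(X))             # sequential reference
```

Because each worker touches only its own columns, the distribution phase needs no communication; only the n-by-n partial sums are exchanged, which is where the paper's sparse approximation (parameterized by N) further reduces traffic.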
Similar resources
Compactly Supported Radial Basis Function Kernels
The use of kernels is a key factor in the success of many classification algorithms by allowing nonlinear decision surfaces. Radial basis function (RBF) kernels are commonly used but often associated with dense Gram matrices. We consider a mathematical operator to sparsify any RBF kernel systematically, yielding a kernel with a compact support and sparse Gram matrix. Having many zero elements i...
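One common way to realize such a sparsifying operator (an illustrative sketch, not necessarily the paper's exact construction) is to multiply the RBF kernel pointwise by a compactly supported taper such as max(0, 1 - r/θ)^v, which zeroes every Gram entry with distance r ≥ θ; the cutoff θ and exponent v here are assumed parameters:

```python
import numpy as np

def sparsified_rbf_gram(X, gamma=1.0, theta=2.0, v=2):
    """Gaussian RBF Gram matrix tapered to compact support."""
    sq = np.sum(X ** 2, axis=1)
    r = np.sqrt(np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0))
    gaussian = np.exp(-gamma * r ** 2)
    taper = np.maximum(0.0, 1.0 - r / theta) ** v   # exactly zero for r >= theta
    return gaussian * taper

X = np.random.default_rng(1).normal(size=(100, 5))
K = sparsified_rbf_gram(X, theta=2.0)
sparsity = np.mean(K == 0.0)   # fraction of exactly-zero Gram entries
```

With points spread out relative to θ, most entries are exactly zero, so the Gram matrix can be stored and multiplied in sparse form.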
Brain MRI Slices Classification Using Least Squares Support Vector Machine
This research paper proposes an intelligent classification technique to identify normal and abnormal slices of brain MRI data. Manual interpretation of tumor slices based on visual examination by a radiologist/physician may lead to missed diagnoses when a large number of MRIs are analyzed. To avoid human error, an automated intelligent classification system is proposed which caters the n...
Support Vector Regression Using Mahalanobis Kernels
In our previous work we have shown that Mahalanobis kernels are useful for support vector classifiers in terms of both generalization ability and model selection speed. In this paper we propose using Mahalanobis kernels for function approximation. We determine the covariance matrix for the Mahalanobis kernel using all the training data. Model selection is done by line search. Namely, first the margin ...
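A minimal sketch of the Mahalanobis kernel described above: the covariance matrix is estimated from all the training data, and its inverse shapes the quadratic form inside the exponential. The scale parameter `delta` is an assumed tunable, and the function name is illustrative:

```python
import numpy as np

def mahalanobis_kernel(X, Y, delta=1.0):
    """exp(-delta * (x - y)^T Sigma^{-1} (x - y)) with Sigma from X."""
    cov = np.cov(X, rowvar=False)                 # covariance of training data
    cov_inv = np.linalg.inv(cov)
    diff = X[:, None, :] - Y[None, :, :]          # all pairwise differences
    quad = np.einsum('ijk,kl,ijl->ij', diff, cov_inv, diff)
    return np.exp(-delta * quad)

X = np.random.default_rng(2).normal(size=(30, 4))
K = mahalanobis_kernel(X, X)
```

Compared with an isotropic RBF kernel, this rescales each direction of feature space by the data's own covariance, which is why a single line search over `delta` can replace a per-dimension width search.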
Kernels for Longitudinal Data with Variable Sequence Length and Sampling Intervals
We develop several kernel methods for classification of longitudinal data and apply them to detect cognitive decline in the elderly. We first develop mixed-effects models, a type of hierarchical empirical Bayes generative models, for the time series. After demonstrating their utility in likelihood ratio classifiers (and the improvement over standard regression models for such classifiers), we d...
Comparing support vector machines with Gaussian kernels to radial basis function classifiers
The support vector (SV) machine is a novel type of learning machine, based on statistical learning theory, which contains polynomial classifiers, neural networks, and radial basis function (RBF) networks as special cases. In the RBF case, the SV algorithm automatically determines centers, weights, and threshold that minimize an upper bound on the expected test error. The present study is devote...
Publication date: 2005